Virtue: Performance Visualization of Parallel and Distributed Applications

Authors

  • Eric Shaffer
  • Daniel A. Reed
  • Shannon Whitmore
  • Benjamin Schaeffer
Abstract

High-speed, wide-area networks have made it both possible and desirable to interconnect geographically distributed applications that control distributed collections of scientific data, remote scientific instruments, and high-performance computer systems. Such an application might, for example, control a remote radio telescope, transmit raw data from the telescope site to a distributed data archive, and concurrently convolve the data to create images for real-time visualization. Developing just such a distributed application infrastructure is the goal of our partners in the National Computational Science Alliance, one of the NSF Partnerships for an Advanced Computational Infrastructure.

Although interconnecting these applications enables geographically distributed science and engineering teams to collaborate in new ways, the resultant distributed computations pose significant performance analysis and optimization challenges. First, the execution environments of geographically distributed applications are far less deterministic than those of locally distributed, parallel applications.¹ Network bandwidths and latencies, computing resources, and available data repositories can vary from one execution to another, and even during a single execution. Consequently, identifying and correcting performance bottlenecks exposed during one execution may not benefit later executions. Second, distributed applications are highly complex. Application components execute atop disparate system software and hardware. Real-time instruments often impose scheduling and access constraints. Accessing data repositories sometimes necessitates data translation for correlation with experimental or computational data. Finally, enabling effective remote interaction and visualization necessitates quality-of-service (QoS) guarantees. Incorporating QoS into these hardware and software systems further increases their complexity.

Historically, performance analysis has focused on monolithic applications executing on large, standalone, parallel systems. In such a domain, measurement, postmortem analysis, and code optimization suffice to eliminate performance bottlenecks and optimize applications. Most existing performance analysis systems (for example, SvPablo,² Medea,³ and Paragraph⁴) use only postmortem analysis. To tune the emerging distributed applications, however, a new generation of online performance measurement and optimization tools must adapt application behavior dynamically as resource availability changes. In addition to providing real-time adaptive control, new performance tools must gather data from multiple sources and software levels (application, library, system, and network). Furthermore, these tools must enable geographically dispersed teams to collaborate in identifying and correcting performance problems. This capability requires support of distributed visualization and control, as well as support of both synchronous and asynchronous collaboration.

The Virtue prototype exploits human sensory capabilities to help performance analysts explore and optimize large-scale, multidisciplinary applications. The visualization environment lets collaborators interact with executing software, tuning its behavior to meet performance goals.

Similar resources

Flexible performance visualization of parallel and distributed applications

Performance debugging of parallel and distributed applications can benefit from behavioral visualization tools helping to capture the dynamics of the executions of applications. The Pajé generic tool presented in this article provides interactive and scalable behavioral visualizations; because of its genericity, it can be used unchanged in a large variety of contexts. © 2002 Elsevier Science B....

Visualization of Parallel Execution Graphs

Measuring and evaluating the runtime of parallel programs is a difficult task. In this paper we present tools for performance evaluation and visualization in the distributed thread system (DTS), a programming environment for portable parallel applications. We describe the visualization of a parallel trace log as an execution graph using a novel layout algorithm which has been tailored to expose t...

A Parallel Debugger with Support for Distributed Arrays, Multiple Executables and Dynamic Processes

In this paper we present the parallel debugger DETOP with special emphasis on new support for debugging of programs with distributed data structures such as arrays that have been partitioned over a number of processors. The new array visualizer within DETOP supports transparent browsing and visualization of distributed arrays which occur in languages such as High Performance Fortran. Visualizat...

A Steering and Visualization Toolkit for Distributed Applications

Parallel and high performance computing has enabled great strides to be made in advancing science and solving large problems. However, this progress is limited by the lack of needed tools and the difficulty of programming and running parallel applications. Specifically, there is a lack of needed steering and visualization tools, which can be easily integrated in existing applications. This pape...

Journal:
  • IEEE Computer

Volume 32

Publication date: 1999